Inventi Impact: Audio, Speech & Music Processing

Articles

Inventi:easm/45481/22

Using Voice Activity Detection and Deep Neural Networks with Hybrid Speech Feature Extraction for Deceptive Speech Detection

01-Jul-2022 Research 2022 : July-September

Serban Mihalache, Dragos Burileanu

In this work, we first propose a deep neural network (DNN) system for the automatic detection of speech in audio signals, otherwise known as voice activity detection (VAD). Several DNN types were investigated, including multilayer perceptrons (MLPs), recurrent neural networks (RNNs), and convolutional neural networks (CNNs), with the best performance being obtained for the latter. Additional postprocessing techniques, i.e., hysteretic thresholding, minimum duration filtering, and bilateral extension, were employed in order to boost performance. The systems were trained and tested using several data subsets of the CENSREC-1-C database, with different simulated ambient noise conditions, and additional testing was performed on a different CENSREC-1-C data subset containing actual ambient noise, as well as on a subset of the TIMIT database. An accuracy of up to 99.13% was obtained for the CENSREC-1-C datasets, and 97.60% for the TIMIT dataset. We proceed to show how the final VAD system can be adapted and employed within an utterance-level deceptive speech detection (DSD) processing pipeline. The best DSD performance is achieved by a novel hybrid CNN-MLP network leveraging a fusion of algorithmically and automatically extracted speech features, and reaches an unweighted accuracy (UA) of 63.7% on the RLDD database, and 62.4% on the RODeCAR database.

How to Cite this Article
Attribution/ CC Compliant Citation: Mihalache, Serban, and Dragos Burileanu. "Using Voice Activity Detection and Deep Neural Networks with Hybrid Speech Feature Extraction for Deceptive Speech Detection." Sensors 22.3 (2022): 1228. https://doi.org/10.3390/s22031228 http://creativecommons.org/licenses/by/4.0/ Some formatting elements, header, footer, logos, dates and pagination were modified while adapting this article.
Download Full Text

Call Us: +4 (800) 888-0008

Inventi Impact: Audio, Speech & Music Processing

Articles

Inventi:easm/45481/22

Using Voice Activity Detection and Deep Neural Networks with Hybrid Speech Feature Extraction for Deceptive Speech Detection

How to Cite this Article

Links

Contact Us